RISCV-BOOM
Questions to answer on BOOM
ACA Materials
Taking this
-
Topics:
• How the latest microprocessors work
• Why they are built that way – and what are the alternatives?
• How you can make software that uses the hardware in the best possible way
• How you can make a compiler that does it for you
• How you can design a computer for your problem
• What does a big computer look like?
• What are the fundamental big ideas and challenges in computer architecture?
• What is the scope for theory?
-
Have a look at past papers:
-
Some of the exam questions will be based on the article studied in class
-
Exam answers are closer to short essays: show clear and informed thinking about the question at hand
-
20% coursework
-
Turing Award lecture by Hennessy and Patterson.
-
John Backus, “Can Programming Be Liberated from the von Neumann Style?” (1978) (have a read)
-
Opcode is always at the same position in the instruction
-
Source register is \\
-
Branch immediate is a PC-relative offset: it is added to the address of the current (branch) instruction to form the target
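A minimal sketch of the two points above, assuming the notes refer to the RV32I base encoding (opcode in bits 6:0, B-type branch format). The function names are hypothetical and the field positions are taken from the RISC-V base ISA, not from these notes.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def decode_branch(instr):
    """Split a 32-bit B-type instruction into (opcode, rs1, rs2, offset)."""
    opcode = instr & 0x7F              # always bits 6:0, whatever the format
    rs1 = (instr >> 15) & 0x1F         # first source register, bits 19:15
    rs2 = (instr >> 20) & 0x1F         # second source register, bits 24:20
    imm = (
        (((instr >> 31) & 0x1) << 12)    # imm[12]
        | (((instr >> 7) & 0x1) << 11)   # imm[11]
        | (((instr >> 25) & 0x3F) << 5)  # imm[10:5]
        | (((instr >> 8) & 0xF) << 1)    # imm[4:1]; imm[0] is always 0
    )
    return opcode, rs1, rs2, sign_extend(imm, 13)


def branch_target(pc, instr):
    """Target address = address of the branch itself + signed offset."""
    return pc + decode_branch(instr)[3]


# 0x00208463 encodes `beq x1, x2, 8`
print(decode_branch(0x00208463))
print(hex(branch_target(0x1000, 0x00208463)))
```

Because the opcode sits in the same bits for every format, the decoder can read it before it knows which format the rest of the word uses.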
-
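The “Turing Tax” is a term for the overhead (in performance, cost, or energy) of universality in this sense: what we pay for running a task on a general-purpose programmable machine instead of hardware built for that task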
-
ASIC - Application-specific integrated circuit
Proper dive 2019-10-15 16:05
-
Improving average memory access time:
-
AMAT = HitTime + MissRate * MissPenalty, so it improves if we reduce the hit time, the miss rate, or the miss penalty
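A quick worked example of the AMAT formula. The numbers are made up for illustration; they are not from the lectures.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in the same unit as hit_time and miss_penalty."""
    return hit_time + miss_rate * miss_penalty


# Hypothetical cache: 1-cycle hit, 5% miss rate, 100-cycle miss penalty.
baseline = amat(1, 0.05, 100)

# Halving the miss rate helps far more here than shaving the hit time would.
improved = amat(1, 0.025, 100)

print(baseline, improved)
```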
-
Types of cache misses:
-
Compulsory - first access to a block that has never been in the cache (fresh data)
-
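Capacity - the block was in the cache but was evicted because the working set does not fit; it would miss even in a fully associative cache of the same size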
-
Conflict - the cache is not full, but the block was still evicted because another block
mapped to the same index
-
Coherence - data invalidated by another processor or device
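One common way to separate the first three miss types is to run the same trace through a fully associative LRU cache of equal size. The sketch below does that for a direct-mapped cache, under a few assumptions: addresses are already block numbers, there is a single processor (so no coherence misses), and the function name is hypothetical.

```python
from collections import OrderedDict


def classify_misses(block_addresses, num_lines):
    """Count hits and compulsory/capacity/conflict misses for a direct-mapped cache."""
    direct = [None] * num_lines   # direct-mapped cache: one block per index
    fully = OrderedDict()         # fully associative LRU reference cache, same size
    seen = set()                  # every block referenced so far
    counts = {"hit": 0, "compulsory": 0, "capacity": 0, "conflict": 0}

    for block in block_addresses:
        # Update the fully associative reference cache first.
        fa_hit = block in fully
        if fa_hit:
            fully.move_to_end(block)          # mark as most recently used
        else:
            if len(fully) >= num_lines:
                fully.popitem(last=False)     # evict the least recently used block
            fully[block] = True

        index = block % num_lines
        if direct[index] == block:
            counts["hit"] += 1
        else:
            if block not in seen:
                counts["compulsory"] += 1     # never referenced before
            elif fa_hit:
                counts["conflict"] += 1       # only the index mapping caused the miss
            else:
                counts["capacity"] += 1       # full associativity would miss too
            direct[index] = block
        seen.add(block)
    return counts


# Blocks 0 and 4 share index 0 in a 4-line cache, so they keep evicting each other.
print(classify_misses([0, 4, 0, 4], 4))
```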
-
Wave speculation (Textbook)
-
Victim Cache -- Search the main cache and the victim cache at the same time:
-
A small, fully associative cache that holds lines recently evicted from the main cache
-
Check the victim cache in parallel with the cache
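A rough software model of a victim cache next to a direct-mapped cache. Assumptions: addresses are block numbers, the victim cache is FIFO, and the class name and sizes are hypothetical. Hardware probes both structures in parallel; checking them one after the other here gives the same outcome.

```python
from collections import deque


class VictimCachedDirectMapped:
    def __init__(self, num_lines, victim_entries):
        self.num_lines = num_lines
        self.lines = [None] * num_lines
        # Bounded FIFO: when full, appending drops the oldest victim.
        self.victims = deque(maxlen=victim_entries)

    def _install(self, index, block):
        """Put block in the main cache; the displaced line becomes a victim."""
        displaced = self.lines[index]
        self.lines[index] = block
        if displaced is not None:
            self.victims.append(displaced)

    def access(self, block):
        index = block % self.num_lines
        if self.lines[index] == block:
            return "hit"
        if block in self.victims:
            # Swap: the victim moves back to the main cache,
            # the line it displaces takes its place in the victim cache.
            self.victims.remove(block)
            self._install(index, block)
            return "victim hit"
        self._install(index, block)      # miss in both: fetch from the next level
        return "miss"


cache = VictimCachedDirectMapped(num_lines=4, victim_entries=2)
# Blocks 0 and 4 conflict on index 0; the victim cache absorbs the ping-pong.
print([cache.access(b) for b in [0, 4, 0, 4]])
```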
-
Skewed-associative caches index each way with a different hash of the address, so blocks that conflict in one way usually do not conflict in the others:
-
Could be worse in some cases: conflicts no longer follow a simple address pattern, so the worst-case scenario is more difficult to predict and avoid
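A toy two-way skewed-associative cache to illustrate the idea. The skewing function for way 1 is a made-up example, not the one used in any real design, and the replacement policy (fill an empty slot, otherwise alternate ways) is also an assumption.

```python
class SkewedAssociativeCache:
    def __init__(self, sets_per_way):
        self.sets = sets_per_way
        self.ways = [[None] * sets_per_way for _ in range(2)]
        self.next_victim = 0

    def _index(self, way, block):
        low = block % self.sets
        high = (block // self.sets) % self.sets
        if way == 0:
            return low                      # conventional index
        return (low + high) % self.sets     # hypothetical skewed index for way 1

    def access(self, block):
        """Return True on a hit, False on a miss (the block is then installed)."""
        for way in range(2):
            if self.ways[way][self._index(way, block)] == block:
                return True
        # Miss: prefer an empty candidate slot, otherwise alternate between ways.
        for way in range(2):
            idx = self._index(way, block)
            if self.ways[way][idx] is None:
                self.ways[way][idx] = block
                return False
        way = self.next_victim
        self.next_victim = 1 - way
        self.ways[way][self._index(way, block)] = block
        return False


cache = SkewedAssociativeCache(sets_per_way=4)
# Blocks 0, 4 and 8 would all fight over set 0 of a conventional 2-way cache.
print([cache.access(b) for b in [0, 4, 8] * 3])
```

With a conventional index in both ways, three blocks sharing a set would thrash two ways; here they land in different slots of way 1 and coexist after the first pass.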
-
Hardware Prefetching:
-
As soon as we have a cache miss, we also initiate a fetch for the next sequential block(s) into a stream buffer
-
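Similar to a victim cache in that there is a small side buffer, but the dataflow is reversed: the stream buffer holds lines on their way into the cache, while the victim cache holds lines on their way out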
-
Always check the stream buffer in parallel with the cache
-
Prefetch far enough ahead (e.g. line n+5 while line n is being accessed) to cover the memory access latency
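A simplified model of a single sequential stream buffer beside a direct-mapped cache. Assumptions: addresses are block numbers, only the head of the buffer is compared, prefetched blocks are treated as already arrived (latency is not modelled), and the class name and sizes are hypothetical.

```python
from collections import deque


class StreamBufferCache:
    def __init__(self, num_lines, depth):
        self.num_lines = num_lines
        self.lines = [None] * num_lines
        self.depth = depth        # how many blocks ahead we prefetch
        self.stream = deque()     # prefetched block numbers, oldest first

    def access(self, block):
        index = block % self.num_lines
        if self.lines[index] == block:
            return "hit"
        if self.stream and self.stream[0] == block:
            # Head of the stream buffer matches: move it into the cache
            # and prefetch one more block to keep the buffer full.
            self.stream.popleft()
            last = self.stream[-1] if self.stream else block
            self.stream.append(last + 1)
            self.lines[index] = block
            return "stream hit"
        # Miss in both: fetch the block and restart the stream just after it.
        self.lines[index] = block
        self.stream = deque(range(block + 1, block + 1 + self.depth))
        return "miss"


cache = StreamBufferCache(num_lines=8, depth=4)
# A sequential scan misses once, then is served from the stream buffer.
print([cache.access(b) for b in [10, 11, 12, 13, 10]])
```

The depth plays the role of the prefetch distance: it has to be large enough that a block has arrived by the time the scan reaches it.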
Reading:
-
The Microarchitecture of the Pentium 4 Processor (Hinton et al, Intel Technology Journal, Q1 2001)
-
The SimpleScalar Tool Set, Version 2.0 (Burger and Austin, http://www.simplescalar.com/docs/users_guide_v2.pdf)
-
Wattch: a framework for architectural-level power analysis and optimizations (Brooks et al, ISCA 2000, www.tortolaproject.com/papers/brooks00wattch.pdf)
• Papers:
– Instruction issue logic for high-performance, interruptable
pipelined processors. G. S. Sohi, S. Vajapeyam.
International Symposium on Computer Architecture (ISCA), 1987
(http://doi.acm.org/10.1145/30350.30354)
– Towards Kilo-instruction processors. Cristal, Santana,
Valero, Martinez ACM Trans. Architecture and Code
Optimization (http://doi.acm.org/10.1145/1044823.1044825)
• Other simulators:
– SimpleScalar: www.simplescalar.com/
– gem5: http://www.gem5.org
– Liberty: http://liberty.cs.princeton.edu/
– SimFlex: http://parsa.epfl.ch/simflex/
– SIMICS: http://www.windriver.com/products/simics/
ACA_CW1
ACA CW2